You have a linked list and want to find the kth to last node.
Write a function kth_to_last_node() that takes an integer k and the head node of a singly-linked list, and returns the kth to last node in the list.
For example:
class LinkedListNode:

    def __init__(self, value):
        self.value = value
        self.next = None

a = LinkedListNode("Angel Food")
b = LinkedListNode("Bundt")
c = LinkedListNode("Cheese")
d = LinkedListNode("Devil's Food")
e = LinkedListNode("Eccles")

a.next = b
b.next = c
c.next = d
d.next = e

# Returns the node with value "Devil's Food" (the 2nd to last node)
kth_to_last_node(2, a)
We can think about this in two ways.
First approach: use the length of the list.
def kth_to_last_node(k, head):
    if k < 1:
        raise ValueError(
            'Impossible to find less than first to last node: %s' % k
        )

    # Step 1: get the length of the list
    # Start at 1, not 0
    # else we'd fail to count the head node!
    list_length = 1
    current_node = head

    # Traverse the whole list,
    # counting all the nodes
    while current_node.next:
        current_node = current_node.next
        list_length += 1

    # If k is greater than the length of the list, there can't
    # be a kth-to-last node, so we'll raise an error!
    if k > list_length:
        raise ValueError(
            'k is larger than the length of the linked list: %s' % k
        )

    # Step 2: walk to the target node
    # Calculate how far to go, from the head,
    # to get to the kth to last node
    how_far_to_go = list_length - k

    current_node = head
    for _ in range(how_far_to_go):
        current_node = current_node.next

    return current_node
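As a quick sanity check, here's the first approach run against the example cake list from above (the snippet repeats the class and function definitions so it runs on its own):

```python
class LinkedListNode:

    def __init__(self, value):
        self.value = value
        self.next = None

def kth_to_last_node(k, head):
    if k < 1:
        raise ValueError(
            'Impossible to find less than first to last node: %s' % k
        )

    # Step 1: count the nodes
    list_length = 1
    current_node = head
    while current_node.next:
        current_node = current_node.next
        list_length += 1

    if k > list_length:
        raise ValueError(
            'k is larger than the length of the linked list: %s' % k
        )

    # Step 2: walk (length - k) steps from the head
    current_node = head
    for _ in range(list_length - k):
        current_node = current_node.next
    return current_node

# Build the example list: Angel Food -> Bundt -> Cheese -> Devil's Food -> Eccles
head = LinkedListNode("Angel Food")
tail = head
for value in ["Bundt", "Cheese", "Devil's Food", "Eccles"]:
    tail.next = LinkedListNode(value)
    tail = tail.next

print(kth_to_last_node(2, head).value)  # Devil's Food
print(kth_to_last_node(5, head).value)  # Angel Food (the 5th to last *is* the head)
```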
Second approach: maintain a k-wide "stick" in one walk down the list.
def kth_to_last_node(k, head):
    if k < 1:
        raise ValueError(
            'Impossible to find less than first to last node: %s' % k
        )

    left_node = head
    right_node = head

    # Move right_node to the kth node
    for _ in range(k - 1):
        # But along the way, if right_node doesn't have a next,
        # then k is greater than the length of the list and there
        # can't be a kth-to-last node! We'll raise an error
        if not right_node.next:
            raise ValueError(
                'k is larger than the length of the linked list: %s' % k
            )
        right_node = right_node.next

    # Starting with left_node on the head,
    # move left_node and right_node down the list,
    # maintaining a distance of k - 1 between them,
    # until right_node hits the end of the list
    while right_node.next:
        left_node = left_node.next
        right_node = right_node.next

    # Since left_node is k - 1 nodes behind right_node,
    # left_node is now the kth to last node!
    return left_node
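A few edge cases for this second approach are worth exercising: k = 1 should return the tail, k equal to the list's length should return the head, and anything larger should raise. (Again, this snippet repeats the definitions so it runs on its own.)

```python
class LinkedListNode:

    def __init__(self, value):
        self.value = value
        self.next = None

def kth_to_last_node(k, head):
    if k < 1:
        raise ValueError(
            'Impossible to find less than first to last node: %s' % k
        )

    left_node = head
    right_node = head

    # Move right_node to the kth node, raising if the list is too short
    for _ in range(k - 1):
        if not right_node.next:
            raise ValueError(
                'k is larger than the length of the linked list: %s' % k
            )
        right_node = right_node.next

    # Walk both pointers in lockstep until right_node hits the tail
    while right_node.next:
        left_node = left_node.next
        right_node = right_node.next

    return left_node

# A 3-node list: a -> b -> c
a = LinkedListNode("a")
b = LinkedListNode("b")
c = LinkedListNode("c")
a.next = b
b.next = c

print(kth_to_last_node(1, a).value)  # c  (1st to last is the tail)
print(kth_to_last_node(3, a).value)  # a  (3rd to last is the head)

try:
    kth_to_last_node(4, a)
except ValueError:
    print("raises for k > length")
```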
In either approach, make sure to check if k is greater than the length of the linked list! That's bad input, and we'll want to raise an error.
Both approaches use O(n) time and O(1) space.
But the second approach is fewer steps since it gets the answer "in one pass," right? Wrong.
In the first approach, we walk one pointer from head to tail (to get the list's length), then walk another pointer from the head node to the target node (the kth to last node).
In the second approach, right_node also walks all the way from head to tail, and left_node also walks from the head to the target node.
So in both cases, we have two pointers taking the same steps through our list. The only difference is the order in which the steps are taken. The number of steps is the same either way.
However, the second approach might still be slightly faster, due to some caching and other optimizations that modern processors and memory have.
Let's focus on caching. Usually when we grab some data from memory (for example, info about a linked list node), we also store that data in a small cache right on the processor. If we need to use that same data again soon after, we can quickly grab it from the cache. But if we don't use that data for a while, we're likely to replace it with other stuff we've used more recently (this is called a "least recently used" replacement policy).
Both of our algorithms access a lot of nodes in our list twice, so they could exploit this caching. But notice that in our second algorithm there's a much shorter time between the first and second times that we access a given node (this is sometimes called "temporal locality of reference"). Thus it seems more likely that our second algorithm will save time by using the processor's cache! But this assumes our processor's cache uses something like a "least recently used" replacement policy—it might use something else. Ultimately the best way to really know which algorithm is faster is to implement both and time them on a few different inputs!
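For example, here's one way to race the two approaches with Python's timeit module on a long list. The function names and the 100,000-node list size are arbitrary choices for illustration, and the error-checking is stripped out for brevity; the absolute numbers will vary from machine to machine.

```python
import timeit

class LinkedListNode:

    def __init__(self, value):
        self.value = value
        self.next = None

def kth_to_last_two_pass(k, head):
    # First pass: count the nodes
    length = 1
    node = head
    while node.next:
        node = node.next
        length += 1
    # Second pass: walk (length - k) steps from the head
    node = head
    for _ in range(length - k):
        node = node.next
    return node

def kth_to_last_one_pass(k, head):
    # Advance right to the kth node, then walk both pointers together
    left = right = head
    for _ in range(k - 1):
        right = right.next
    while right.next:
        left = left.next
        right = right.next
    return left

# Build a 100,000-node list
head = LinkedListNode(0)
tail = head
for i in range(1, 100_000):
    tail.next = LinkedListNode(i)
    tail = tail.next

# Both should find the same node
assert kth_to_last_two_pass(10, head) is kth_to_last_one_pass(10, head)

for fn in (kth_to_last_two_pass, kth_to_last_one_pass):
    seconds = timeit.timeit(lambda: fn(10, head), number=20)
    print(fn.__name__, round(seconds, 3))
```

Whatever the timings show on your machine, don't take either number on faith for other list sizes or hardware; measure in the environment you care about.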
Can we do better? What if we expect n to be huge and k to be pretty small? In this case, our target node will be close to the end of the list...so it seems a waste that we have to walk all the way from the beginning twice.
Can we trim down the number of steps in the "second trip"? One pointer will certainly have to travel all the way from head to tail in the list to get the total length...but can we store some "checkpoints" as we go so that the second pointer doesn't have to start all the way at the beginning? Can we store these "checkpoints" in constant space? Note: this approach only saves time if we know that our target node is towards the end of the list (in other words, n is much larger than k).
We listed two good solutions. One seemed to solve the problem in one pass, while the other took two passes. But the single-pass approach didn't take half as many steps, it just took the same steps in a different order.
So don't be fooled: "one pass" isn't always fewer steps than "two passes." Always ask yourself, "Have I actually changed the number of steps?"